feat!: integrate all vision models with vision camera by NorbertKlockiewicz · Pull Request #880 · software-mansion/react-native-executorch

NorbertKlockiewicz · 2026-02-25T15:20:28Z

Description

- Added PixelData input support to all vision model hooks and modules: forward() now accepts both string URIs and raw pixel buffers
- Added runOnFrame worklet to all vision model hooks for real-time VisionCamera v5 frame processing (useClassification, useImageEmbeddings, useOCR, useVerticalOCR, useObjectDetection, useSemanticSegmentation, useStyleTransfer)
- Refactored StyleTransfer C++ to unified generateFromString / generateFromPixels with a saveToFile flag
- Added new VisionCamera integration docs page with setup guide, full example, Module API pattern, and common issues section
- Updated all vision model hook and module docs to document PixelData input and runOnFrame
- Added integration tests: VisionModelTest (unit tests for extractFromPixels and preprocess), FrameProcessorTest (unit tests for pixelsToMat), StyleTransferTest (full coverage of both output modes), and generateFromPixels smoke tests across all vision models

Breaking changes

`StyleTransferType.forward` return type changed

Before:

  forward(imageSource: string): Promise<string>

After:

  forward(input: string | PixelData, output?: 'pixelData' | 'url'): Promise<PixelData | string>
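
For callers migrating across this change, here is a minimal sketch of how the requested output kind can select the return type. Only the parameter names follow the signature above. The `PixelData` shape, the `ForwardResult` helper type, and the mock body (which echoes its input instead of running a model) are assumptions made for illustration, not the library's implementation.

```typescript
// Hypothetical stand-in for the library's PixelData; the real shape may differ.
interface PixelData {
  dataPtr: Uint8Array;
  sizes: [number, number, number]; // assumed order: height, width, channels
}

type OutputKind = 'pixelData' | 'url';

// Maps the requested output kind to the matching result type.
type ForwardResult<O extends OutputKind> = O extends 'url' ? string : PixelData;

// Mock implementation: no model is run, the input is echoed back.
async function forward<O extends OutputKind = 'pixelData'>(
  input: string | PixelData,
  output?: O
): Promise<ForwardResult<O>> {
  const pixels: PixelData =
    typeof input === 'string'
      ? { dataPtr: new Uint8Array(4 * 4 * 3), sizes: [4, 4, 3] }
      : input;
  // Placeholder URL; a real implementation would return the saved file's URL.
  const result = output === 'url' ? 'file:///tmp/styled.png' : pixels;
  return result as ForwardResult<O>;
}

async function demo(): Promise<void> {
  const pixels = await forward('file:///tmp/in.png'); // typed as PixelData
  const url = await forward(pixels, 'url'); // typed as string
  console.log(pixels.sizes, url);
}

void demo();
```

Code that previously relied on `forward` returning a string would, under this sketch, need to pass `'url'` explicitly, since the default output kind is pixel data.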

Known Issues

Orientation handling is static right now; it will be fixed in the next PR.

Introduces a breaking change?

Yes
No

Type of change

Bug fix (change which fixes an issue)
New feature (change which adds functionality)
Documentation update (improves or adds clarity to existing documentation)
Other (chores, tests, code style improvements etc.)

Tested on

iOS
Android

Testing instructions

Run computer vision app -> vision camera screen -> test all models
Also test rest of the models in this app.
Run test suite

Screenshots

Related issues

Checklist

I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have updated the documentation accordingly
My changes generate no new warnings

Additional notes

barhanc · 2026-03-13T11:42:03Z

On Android the output for front camera is upside down and mirrored. Is this an expected behaviour that will be fixed with orientation handling?

NorbertKlockiewicz · 2026-03-13T11:43:40Z

> On Android the output for front camera is upside down and mirrored. Is this an expected behaviour that will be fixed with orientation handling?

Yes, I am working on it now, but this PR already has too many changes, so it will be added in the next one.

msluszniak · 2026-03-13T12:26:35Z

+  EXPECT_NO_THROW((void)model.generateFromString(kValidTestImagePath, false));
+  auto result1 = model.generateFromString(kValidTestImagePath, false);
+  auto result2 = model.generateFromString(kValidTestImagePath, false);
+  ASSERT_TRUE(std::holds_alternative<PixelDataResult>(result1));
+  ASSERT_TRUE(std::holds_alternative<PixelDataResult>(result2));
+  EXPECT_NE(std::get<PixelDataResult>(result1).dataPtr, nullptr);
+  EXPECT_NE(std::get<PixelDataResult>(result2).dataPtr, nullptr);


Why are these repeated?

msluszniak · 2026-03-13T12:28:05Z

+TEST(StyleTransferPixelTests, ValidPixelsSaveToFileTrueReturnsString) {
+  StyleTransfer model(kValidStyleTransferModelPath, nullptr);
+  std::vector<uint8_t> buf;
+  auto view = makeRgbView(buf, 64, 64);
+  auto result = model.generateFromPixels(view, true);
+  EXPECT_TRUE(std::holds_alternative<std::string>(result));
+}
+
+TEST(StyleTransferPixelTests, ValidPixelsSaveToFileTrueHasFileScheme) {
+  StyleTransfer model(kValidStyleTransferModelPath, nullptr);
+  std::vector<uint8_t> buf;
+  auto view = makeRgbView(buf, 64, 64);
+  auto result = model.generateFromPixels(view, true);
+  ASSERT_TRUE(std::holds_alternative<std::string>(result));
+  EXPECT_TRUE(std::get<std::string>(result).starts_with("file://"));
+}


The first test is a subset of the second one

msluszniak · 2026-03-13T12:29:38Z

+TEST(PixelsToMatValidInput, ProducesCorrectRows) {
+  std::vector<uint8_t> buf;
+  auto view = makeValidView(buf, 48, 64);
+  EXPECT_EQ(pixelsToMat(view).rows, 48);
+}
+
+TEST(PixelsToMatValidInput, ProducesCorrectCols) {
+  std::vector<uint8_t> buf;
+  auto view = makeValidView(buf, 48, 64);
+  EXPECT_EQ(pixelsToMat(view).cols, 64);
+}


Merge these two into one test.

msluszniak · 2026-03-13T12:30:06Z

+TEST(PixelsToMatValidInput, ProducesThreeChannelMat) {
+  std::vector<uint8_t> buf;
+  auto view = makeValidView(buf, 4, 4);
+  EXPECT_EQ(pixelsToMat(view).channels(), 3);
+}
+
+TEST(PixelsToMatValidInput, MatTypeIsCV_8UC3) {
+  std::vector<uint8_t> buf;
+  auto view = makeValidView(buf, 4, 4);
+  EXPECT_EQ(pixelsToMat(view).type(), CV_8UC3);
+}


Again, merge these and make both checks in one test.

msluszniak · 2026-03-13T15:08:41Z

That might look better:

msluszniak · 2026-03-13T15:22:54Z

I also get these warnings when running the demo apps:

 WARN  Route "./vision_camera/tasks/types.ts" is missing the required default export. Ensure a React component is exported as default.
 WARN  Route "./vision_camera/utils/colors.ts" is missing the required default export. Ensure a React component is exported as default.

msluszniak · 2026-03-13T15:31:06Z

Trying to run regular style transfer freezes the app on my Android phone.

This is probably related to #962, so we can ignore it for now.

chmjkb

left a couple of comments for now, though I haven't finished reviewing :D

chmjkb · 2026-03-13T14:50:20Z

+## VisionCamera integration
+
+For real-time object detection on camera frames, use `runOnFrame`. It runs synchronously on the JS worklet thread and returns `Detection[]`.
+
+See the full guide: [VisionCamera Integration](./visioncamera-integration.md).


Honestly, I think just a link to the vision camera integration doc would be enough. That way we don't have to remember to update these docs when changing return types. This is a general comment on all the doc changes.

chmjkb · 2026-03-13T15:01:51Z

When downloading the model I got the maximum state depth update error

Oh, on Android or iOS? And which model were you downloading?

iOS, and I believe it was the selfie segmentation one.

I tried to reproduce it on an iPhone 16 Pro and an SE 3 and wasn't able to. As it's just the example app, I suggest we keep using it and watch whether it shows up again.

chmjkb · 2026-03-13T15:13:43Z

+  sizesArray.setValueAtIndex(runtime, 2, jsi::Value(4));
+  obj.setProperty(runtime, "sizes", sizesArray);
+
+  obj.setProperty(runtime, "scalarType", jsi::Value(0));


Use `ScalarType::Byte` instead of the plain magic number 0.

chmjkb · 2026-03-13T17:30:16Z

      },
    ],
-    'camelcase': 'error',
+    'camelcase': ['error', { properties: 'never' }],


Why are we ignoring this rule for properties?

chmjkb · 2026-03-13T17:37:50Z

-  forward: (imageSource: string) => Promise<string>;
+  forward<O extends 'pixelData' | 'url' = 'pixelData'>(
+    input: string | PixelData,
+    output?: O


maybe outputKind / outputType would be a bit better

chmjkb · 2026-03-16T08:54:17Z

+   *
+   * **Note**: For VisionCamera frame processing, use `runOnFrame` instead.
+   *
+   * @param input - Image source (string or PixelData object)


I think you can link to pixeldata here

chmjkb · 2026-03-16T08:55:32Z

+   *
+   * Available after model is loaded (`isReady: true`).
+   *
+   * @example


Other models don't have such an example. I think we should follow a single approach.

chmjkb · 2026-03-16T09:54:48Z

+/**
+ * Given a model configs record (mapping model names to `{ labelMap }`) and a
+ * type `T` (either a model name key or a raw {@link LabelEnum}), resolves to
+ * the label map for that model or `T` itself.
+ *
+ * @internal
+ */
+export type ResolveLabels<
+  T,
+  Configs extends Record<string, { labelMap: LabelEnum }>,
+> = T extends keyof Configs
+  ? Configs[T]['labelMap']
+  : T extends LabelEnum
+    ? T
+    : never;
+


I think that, to keep types separate, this should be moved to some other file.
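
To make the behaviour of `ResolveLabels` concrete, here is a small sketch of how the conditional type resolves in both of its branches. The `LabelEnum` shape, the label maps, and the `'pet-detector'` model name are hypothetical and exist only for this example.

```typescript
// Assumed shape of LabelEnum: an enum-like object mapping label names to ids.
type LabelEnum = Readonly<Record<string, string | number>>;

// Same definition as in the diff above.
type ResolveLabels<
  T,
  Configs extends Record<string, { labelMap: LabelEnum }>,
> = T extends keyof Configs
  ? Configs[T]['labelMap']
  : T extends LabelEnum
    ? T
    : never;

// Hypothetical label maps and model registry, for illustration only.
const PetLabels = { CAT: 0, DOG: 1 } as const;
const CustomLabels = { SCRATCH: 0, DENT: 1 } as const;
const modelConfigs = { 'pet-detector': { labelMap: PetLabels } } as const;
type Configs = typeof modelConfigs;

// A model name resolves to the label map registered for that model.
type FromName = ResolveLabels<'pet-detector', Configs>;
// A raw label map is not a key of Configs, so it resolves to itself.
type FromRaw = ResolveLabels<typeof CustomLabels, Configs>;

// These assignments only type-check if the resolution works as described.
const petLabel: keyof FromName = 'DOG';
const customLabel: keyof FromRaw = 'DENT';

console.log(PetLabels[petLabel], CustomLabels[customLabel]);
```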

chmjkb · 2026-03-16T09:58:56Z

-  private constructor(nativeModule: unknown) {
-    super();
-    this.nativeModule = nativeModule;
-  }


Is there a reason why we're ditching this? I feel like this is a cleaner way of creating models, where the factory is responsible for creating the native object, and the module just receives a ready to use thing.

I selected the wrong version during the rebase :/
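
For reference, a minimal sketch of the construction pattern discussed in this thread: a private constructor plus a static factory that creates the native object and hands it over ready to use. `NativeHandle`, `loadNative`, and `ExampleEmbeddingsModule` are hypothetical names; `loadNative` merely stands in for a global loader such as `global.loadImageEmbeddings`.

```typescript
// Hypothetical native handle; the real object is created by the native loader.
interface NativeHandle {
  generate(input: string): number[];
}

// Stand-in for a global native loader; returns a fake handle for this sketch.
async function loadNative(modelPath: string): Promise<NativeHandle> {
  return { generate: (input: string) => [modelPath.length, input.length] };
}

class ExampleEmbeddingsModule {
  private readonly nativeModule: NativeHandle;

  // Private: callers must use the factory, so every instance holds a
  // ready-to-use native handle from the moment it exists.
  private constructor(nativeModule: NativeHandle) {
    this.nativeModule = nativeModule;
  }

  // The factory owns the creation of the native object.
  static async fromModelPath(modelPath: string): Promise<ExampleEmbeddingsModule> {
    const native = await loadNative(modelPath);
    return new ExampleEmbeddingsModule(native);
  }

  forward(input: string): number[] {
    return this.nativeModule.generate(input);
  }
}
```

Compared with assigning `instance.nativeModule` after `new`, this keeps the field non-nullable and read-only, so no method has to guard against a half-constructed module.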

chmjkb · 2026-03-16T10:15:22Z

+      const instance = new ImageEmbeddingsModule();
+      instance.nativeModule = await global.loadImageEmbeddings(paths[0]);
+      return instance;


same here regarding the construction method

chmjkb · 2026-03-16T10:26:54Z

When can this happen? Maybe we should throw if it does?

if (!this.nativeModule?.generateFromFrame) { return null; }

It skips inference if the model isn't loaded; however, I think we can just check `if (this.nativeModule == null)`.

My point is that skipping inference silently here might be unexpected for the user.
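
A minimal sketch of the alternative raised here, failing loudly instead of returning `null` when the model is not loaded. The `Frame` and `NativeVisionModule` shapes, the `load` method, and the error message are assumptions for illustration; the real module and worklet wiring will differ.

```typescript
// Hypothetical frame and native-module shapes, for illustration only.
interface Frame {
  width: number;
  height: number;
}

interface NativeVisionModule {
  generateFromFrame(frame: Frame): string[];
}

class ExampleVisionModule {
  private nativeModule: NativeVisionModule | null = null;

  // Stand-in for model loading; the real module loads a native object.
  load(native: NativeVisionModule): void {
    this.nativeModule = native;
  }

  runOnFrame(frame: Frame): string[] {
    // Throw rather than silently skipping inference, so a missing model
    // surfaces as an error instead of an unexplained empty result.
    if (this.nativeModule == null) {
      throw new Error('Model is not loaded; load it before calling runOnFrame.');
    }
    return this.nativeModule.generateFromFrame(frame);
  }
}
```

The trade-off is that callers running this per frame need to ensure the model is ready before they start processing, or catch the error themselves.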

chmjkb · 2026-03-16T10:27:10Z

+    if (!this.nativeModule?.generateFromFrame) {
+      return null;
+    }


Same concern as in VisionModule

chmjkb · 2026-03-16T10:49:37Z

+TEST(StyleTransferThreadSafetyTests, TwoConcurrentGeneratesDoNotCrash) {
+  StyleTransfer model(kValidStyleTransferModelPath, nullptr);
+  std::atomic<int32_t> successCount{0};
+  std::atomic<int32_t> exceptionCount{0};
+
+  auto task = [&]() {
+    try {
+      (void)model.generateFromString(kValidTestImagePath, false);
+      successCount++;
+    } catch (const RnExecutorchError &) {
+      exceptionCount++;
+    }
+  };
+
+  std::thread a(task);
+  std::thread b(task);
+  a.join();
+  b.join();
+
+  EXPECT_EQ(successCount + exceptionCount, 2);
+}
+
+TEST(StyleTransferThreadSafetyTests,


I feel like this could be a typed test for all the models

msluszniak · 2026-03-16T17:15:41Z

+  auto sizesArray = jsi::Array(runtime, 3);
+  sizesArray.setValueAtIndex(runtime, 0, jsi::Value(result.height));
+  sizesArray.setValueAtIndex(runtime, 1, jsi::Value(result.width));
+  sizesArray.setValueAtIndex(runtime, 2, jsi::Value(4));


What is 4 here?

It's the number of channels, but it should be passed the same way as the height and width. On it now!
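
The idea behind both magic-number comments (the scalar type `0` and the channel count `4`) can be sketched on the TypeScript side as follows. This is an illustration only: `PixelDataResult` with a `channels` field, `PixelDataDescriptor`, and `describePixelData` are hypothetical, and the `ScalarType` object is an assumed mirror of the native enum, listing just the value used here.

```typescript
// Assumed mirror of the native ScalarType enum; only the value used here.
const ScalarType = { Byte: 0 } as const;

// Hypothetical result shape that carries its own channel count.
interface PixelDataResult {
  data: Uint8Array;
  height: number;
  width: number;
  channels: number;
}

interface PixelDataDescriptor {
  sizes: [number, number, number]; // height, width, channels
  scalarType: number;
}

function describePixelData(result: PixelDataResult): PixelDataDescriptor {
  // The channel count comes from the result, not from a hard-coded literal.
  const expectedLength = result.height * result.width * result.channels;
  if (result.data.length !== expectedLength) {
    throw new Error(
      'Expected ' + expectedLength + ' bytes, got ' + result.data.length
    );
  }
  return {
    sizes: [result.height, result.width, result.channels],
    scalarType: ScalarType.Byte,
  };
}
```

Passing the channel count alongside the height and width also lets the buffer length be checked against the declared shape, which a literal `4` cannot guarantee.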

dismiss due to work leave

NorbertKlockiewicz self-assigned this Feb 25, 2026

NorbertKlockiewicz linked an issue Feb 25, 2026 that may be closed by this pull request

Vision Camera integration - other vision models #829

Closed

NorbertKlockiewicz force-pushed the @nk/vision-models-camera-integration branch from 66e358e to 0a8493b Compare February 25, 2026 15:36

NorbertKlockiewicz force-pushed the @nk/vision-models-camera-integration branch from 787ea7d to 7488857 Compare March 10, 2026 10:50

NorbertKlockiewicz changed the title ~~@nk/vision models camera integration~~ feat: integrate all vision models with vision camera Mar 11, 2026

NorbertKlockiewicz force-pushed the @nk/vision-models-camera-integration branch from 35c2a3f to a8d95a8 Compare March 11, 2026 13:10

NorbertKlockiewicz added 3rd party package Issue related to 3rd party packages, but not ExecuTorch, e.g. Expo feature PRs that implement a new feature labels Mar 12, 2026

NorbertKlockiewicz force-pushed the @nk/vision-models-camera-integration branch from af8ebfe to 951ab91 Compare March 12, 2026 15:42

NorbertKlockiewicz linked an issue Mar 12, 2026 that may be closed by this pull request

Vision camera integration - documentation #827

Closed

NorbertKlockiewicz changed the title ~~feat: integrate all vision models with vision camera~~ feat(style trasfer)!: integrate all vision models with vision camera Mar 13, 2026

NorbertKlockiewicz changed the title ~~feat(style trasfer)!: integrate all vision models with vision camera~~ feat(style transfer)!: integrate all vision models with vision camera Mar 13, 2026

NorbertKlockiewicz marked this pull request as ready for review March 13, 2026 11:03

NorbertKlockiewicz requested review from benITo47, chmjkb and msluszniak March 13, 2026 11:03

NorbertKlockiewicz changed the title ~~feat(style transfer)!: integrate all vision models with vision camera~~ feat!: integrate all vision models with vision camera Mar 13, 2026

msluszniak reviewed Mar 13, 2026

Comment thread ...ages/react-native-executorch/common/rnexecutorch/models/embeddings/image/ImageEmbeddings.cpp

NorbertKlockiewicz force-pushed the @nk/vision-models-camera-integration branch from e07a54c to 190a19b Compare March 13, 2026 13:46

chmjkb requested changes Mar 13, 2026

mkopcins reviewed Mar 16, 2026

Comment thread packages/react-native-executorch/common/rnexecutorch/host_objects/ModelHostObject.h

chmjkb previously requested changes Mar 16, 2026

NorbertKlockiewicz requested review from chmjkb and msluszniak March 16, 2026 13:02

NorbertKlockiewicz and others added 23 commits March 16, 2026 16:53

docs: update documentation

c18ebad

docs: update docs link

5c50292

chore: tests, docs, comments etc.

35411bf

docs: update vision camera docs page

35b722b

refactor: unused include

43b1295

refactor: extract vision camera color utils

312b45a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor: add ClassificationTask component

035dbab

refactor: add ObjectDetectionTask component

c68d909

refactor: add SegmentationTask component

1267ec4

refactor: simplify vision camera screen to shell + task components

eb8cccf

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor: remove comments

baff8f5

fix: after rebase

8e55c12

refactor: batch 1 suggestions

61eaf79

tests: apply merging suggestsions

090e95b

fix: move the vision camera components so they are not treated as part of navigation

3a5f564

fix: style transfer crashes app, rename output -> outputType

ac75541

reafactor: use ScalarType enum instead of magic number in jsi conversions

c075d8b

docs: remove a comment about vision camera integration

33721b4

chore: use camelCase ids for model in vision camera demo

81b490a

Update packages/react-native-executorch/common/rnexecutorch/utils/FrameProcessor.cpp

648a045

Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com>

feat: requested changes

0a0c664

tests: create typed tests for vision models concurrent generates

c5bf1fd

refactor: follow declaration order in VisionModel class

836c7a2

NorbertKlockiewicz force-pushed the @nk/vision-models-camera-integration branch from b28847b to 836c7a2 Compare March 16, 2026 15:53

msluszniak reviewed Mar 16, 2026

fix: replace magic number 4 with channels field in PixelDataResult

3ea1a50

mkopcins approved these changes Mar 17, 2026

NorbertKlockiewicz merged commit 4279ad4 into main Mar 17, 2026
5 checks passed

NorbertKlockiewicz deleted the @nk/vision-models-camera-integration branch March 17, 2026 09:18
